Large-Scale Multi-Label Learning with Incomplete Label Assignments

Authors

Xiangnan Kong

Zhaoming Wu

Li-Jia Li

Ruofei Zhang

Philip S. Yu

Hang Wu

Wei Fan

Abstract

Multi-label learning deals with classification problems in which each instance can be assigned multiple labels simultaneously. Conventional multi-label learning approaches mainly focus on exploiting label correlations. It is usually assumed, explicitly or implicitly, that the label sets of training instances are complete, with no missing labels. However, in many real-world multi-label datasets, the label assignments for training instances can be incomplete: some ground-truth labels can be missed by the labeler. This problem is especially typical when the number of instances is very large and the labeling cost is very high, which makes it almost impossible to obtain a fully labeled training set. In this paper, we study the problem of large-scale multi-label learning with incomplete label assignments. We propose an approach, called Mpu, based on positive and unlabeled stochastic gradient descent and stacked models. Unlike prior work, our method can effectively and efficiently consider missing labels and label correlations simultaneously, and it is highly scalable, with time complexity linear in the size of the data. Extensive experiments on two real-world multi-label datasets show that our Mpu model consistently outperforms other commonly used baselines.
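The abstract does not spell out the Mpu objective, so the following is only a rough illustration of the general idea behind positive-and-unlabeled (PU) learning with stochastic gradient descent, not the authors' algorithm. It assumes one logistic model per label and a common PU heuristic: an observed label counts as a full-weight positive, while an absent label is treated as "unlabeled" and contributes a down-weighted negative term. The function names, the `unlabeled_weight` parameter, and the toy data are all hypothetical, and the stacked-model stage is not shown.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_pu_sgd(X, Y, n_labels, unlabeled_weight=0.3, lr=0.1, epochs=50, seed=0):
    """Train one logistic model per label with SGD.

    X: list of feature lists.
    Y: list of sets holding the observed (possibly incomplete) positive labels.
    unlabeled_weight: assumed down-weighting of absent labels, since an
        absent label may be a missed positive rather than a true negative.
    """
    rng = random.Random(seed)
    n_features = len(X[0])
    # One weight vector per label; the last entry is the bias term.
    W = [[0.0] * (n_features + 1) for _ in range(n_labels)]
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            x = X[i] + [1.0]
            for k in range(n_labels):
                y = 1.0 if k in Y[i] else 0.0
                weight = 1.0 if y == 1.0 else unlabeled_weight
                p = sigmoid(sum(w * v for w, v in zip(W[k], x)))
                grad = weight * (p - y)  # gradient of the weighted log loss
                for j in range(len(x)):
                    W[k][j] -= lr * grad * x[j]
    return W


def predict_scores(W, x):
    # Per-label probabilities for a single instance.
    x = x + [1.0]
    return [sigmoid(sum(w * v for w, v in zip(wk, x))) for wk in W]


# Toy data: feature 0 signals label 0, feature 1 signals label 1.
# Two instances in each group have their true label missing.
X_toy = [[1.0, 0.0]] * 6 + [[0.0, 1.0]] * 6
Y_toy = [{0}] * 4 + [set()] * 2 + [{1}] * 4 + [set()] * 2
W_toy = train_pu_sgd(X_toy, Y_toy, n_labels=2)
print(predict_scores(W_toy, [1.0, 0.0]))
```

Each update touches one instance and one label at a time, so a pass over the data costs time proportional to the number of instances times the number of labels and features, which is in the spirit of the linear scaling the abstract claims.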

Similar resources

An Efficient Large-scale Semi-supervised Multi-label Classifier Capable of Handling Missing labels

Multi-label classification has received considerable interest in recent years. Multi-label classifiers have to address many problems, including handling large-scale datasets with many instances and a large set of labels, compensating for missing label assignments in the training set, considering correlations between labels, and exploiting unlabeled data to improve prediction performance. To ...

Enhancing multi-label classification by modeling dependencies among labels

In this paper, we propose a novel framework for multi-label classification, which directly models the dependencies among labels using a Bayesian network. Each node of the Bayesian network represents a label, and the links and conditional probabilities capture the probabilistic dependencies among multiple labels. We employ our Bayesian network structure learning method, which guarantees to find ...

MLIFT: Enhancing Multi-label Classifier with Ensemble Feature Selection

Multi-label classification has gained significant attention during recent years, due to the increasing number of modern applications associated with multi-label data. Despite its short life, different approaches have been presented to solve the task of multi-label classification. LIFT is a multi-label classifier which utilizes a new strategy to multi-label learning by leveraging label-specific ...

Semi-Supervised Multi-Label Learning with Incomplete Labels

The problem of incomplete labels is frequently encountered in many application domains where the training labels are obtained via crowd-sourcing. The label incompleteness significantly increases the difficulty of acquiring accurate multi-label prediction models. In this paper, we propose a novel semi-supervised multi-label method that integrates low-rank label matrix recovery into the manifold ...

Multiple Kernel and Multi-label Learning for Image Categorization

By Serhat Selçuk Bucak. One crucial step in recovering useful information from large image collections is image categorization. The goal of image categorization is to find the relevant labels for a given image from a closed set of labels. Despite the huge interest and significant contributions by the research community, there rema...

Publication date: 2014
